Abstract: Many implementations for decoding LDPC codes 
are based on the (normalized/offset) min-sum algorithm due 
to its satisfactory performance and simplicity in operations. 
Usually, each iteration of the min-sum algorithm contains two 
scans, the horizontal scan and the vertical scan. This paper 
presents a single-scan version of the min-sum algorithm to speed 
up the decoding process. It can also reduce memory usage or 
wiring because it only needs the addressing from check nodes 
to variable nodes while the original min-sum algorithm requires 
that addressing plus the addressing from variable nodes to check 
nodes. To cut down memory usage or wiring further, another 
version of the single-scan min-sum algorithm is presented where 
the messages of the algorithm are represented by single bit values 
instead of using fixed point ones. The software implementation 
has shown that the single-scan min-sum algorithm is more than 
twice as fast as the original min-sum algorithm. 

I. Introduction 

The sum-product algorithm [1], [2], also known as the 
belief propagation algorithm [3], is the most powerful iter- 
ative soft decoding algorithm for LDPC (low density parity 
check) codes [4], [5], [6]. The normalized/offset min-sum 
algorithm [7], [8], [9], [10] has been shown in [9], [10] to be 
a good approximation to the sum-product algorithm. It is a 
parallel, iterative soft decoding algorithm for LDPC codes. 
It is simpler in computation than the sum-product algorithm 
because it uses only minimization and summation operations 
instead of multiplication and summation operations used by 
the latter. It is also simpler in computation than the sum- 
product algorithm in the log domain because the latter uses 
non-linear functions. For hardware/software implementations, 
multiplication operations and non-linear functions are, in 
general, more expensive than minimization and summation 
operations. 

Despite its reduced complexity, we found, in implementing 
the normalized/offset min-sum algorithm for China's 
HDTV, that the min-sum algorithm is still expensive for hard- 
ware/software implementations since two scans are required 
by the algorithm at each iteration, and its convergence rate 
is generally not satisfactory. The min-sum algorithm is also 
not memory efficient. The temporary results of the algorithm 
are stored in memory as fixed point values. The number of 
values is proportional to the number of non-zero elements of 
the parity check matrix of an LDPC code. They require large 
circuit areas because the number of nonzero elements is not 
small in practice. Manipulating those values also takes considerable 
system time and consumes much system power at run time. 
We concluded that further simplification of the min-sum 
algorithm is needed to meet the ever more demanding requirements 
of next-generation communication systems. 

This paper presents two simplified versions of the min- 
sum algorithm to increase its decoding speed and reduce the 
requirement on memory usage. Those simplifications are based 
on several obvious observations, some of which have already been 
mentioned by other researchers [12]. However, the previous 
literature has offered no detail in the form of algorithms 
that can be used directly by engineers and practitioners in 
the communication area. Furthermore, the advantage of the 
simplifications is not negligible because the simplified min-sum 
algorithm more than doubles the decoding speed of the 
standard min-sum algorithm in our software implementation 
(in C language) for decoding the quasi-cyclic irregular LDPC 
codes used for China's HDTV, the irregular LDPC codes used 
for European digital video broadcasting using satellites (DVB- 
S2), and the regular/irregular LDPC codes from Dr. MacKay's 
website. The comparison is fair because both algorithms are 
simple in operations and can be implemented in software in a 
straightforward way without much room for further improve- 
ment. Our HDTV research group at Tsinghua University has 
benefited from the simplifications because we most often use 
software simulations first to test the performance of different 
LDPC codes for China's HDTV which could be very time- 
consuming and may take hours or even days running on fast 
Intel-based desktop computers. 

II. Definitions and notations 

LDPC codes belong to a special class of linear block codes 
whose parity check matrix H has a low density of ones. 
LDPC codes were originally introduced by Gallager in his 
thesis [4]. After the discovery of turbo codes in 1993 by Berrou 
et al. [11], LDPC codes were rediscovered by MacKay and 
Neal [5] in 1995. Both classes have excellent performance 
in terms of error correction, close to the Shannon limit. For 
a binary LDPC code, $H$ is a binary matrix with elements 
$h_{mn} \in \{0, 1\}$. Let the code word length be $N$; 
then $H$ is an $M \times N$ matrix, where $M$ is the number of rows. 

Each row $H_i$ ($1 \le i \le M$) of $H$ introduces one parity check 
constraint on the input data $x = (x_1, x_2, \ldots, x_N)$, i.e., 

$$H_i x^T = 0 \mod 2.$$

So there are M constraints on x in total. 
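
As a small illustration of these constraints, the following Python sketch evaluates $H x^T \bmod 2$ for a toy parity check matrix. The matrix `H` (a (7,4) Hamming parity check matrix, which is much denser than a practical LDPC matrix), the function name `syndrome`, and the two test vectors are assumptions made for this example; they are not taken from the codes discussed in this paper.

```python
def syndrome(H, x):
    """Return H x^T mod 2 as a list with one entry per parity check row."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]


# Toy parity check matrix (assumed for illustration): M = 3 rows, N = 7 columns.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

x_valid = [1, 0, 0, 0, 1, 1, 0]    # satisfies all M = 3 constraints
x_invalid = [1, 0, 0, 0, 1, 1, 1]  # same word with the last bit flipped

print(syndrome(H, x_valid))    # [0, 0, 0]
print(syndrome(H, x_invalid))  # [0, 0, 1]
```

A word is a valid codeword exactly when every entry of the returned list is zero.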

Let $\mathcal{N}(m)$ be the set of variable nodes that are included in 
the $m$-th parity check constraint. Let $\mathcal{M}(n)$ be the set of check 
nodes which contain the variable node $n$. $\mathcal{N}(m) \setminus n$ denotes 
the set of variable nodes excluding node $n$ that are included 
in the $m$-th parity check constraint. $\mathcal{M}(n) \setminus m$ stands for the 
set of check nodes excluding the check node $m$ which contain 
the variable node $n$. The symbol '$\setminus$' denotes set difference. 
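
The two families of index sets can be built directly from the rows and columns of $H$. The sketch below is one possible way to do so in Python; the names `neighbor_sets`, `N_of`, and `M_of`, the toy matrix, and the zero-based indexing (the text counts from 1) are assumptions of this example.

```python
def neighbor_sets(H):
    """Build N(m) for every check node m and M(n) for every variable node n."""
    M, N = len(H), len(H[0])
    N_of = [[n for n in range(N) if H[m][n]] for m in range(M)]  # N(m)
    M_of = [[m for m in range(M) if H[m][n]] for n in range(N)]  # M(n)
    return N_of, M_of


# Toy parity check matrix (assumed for illustration only).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

N_of, M_of = neighbor_sets(H)
N0_without_3 = [n for n in N_of[0] if n != 3]  # the set N(0) \ 3

print(N_of[0])        # [0, 1, 3, 4]
print(M_of[3])        # [0, 1, 2]
print(N0_without_3)   # [0, 1, 4]
```

Both families list the same non-zero elements of $H$, once by row and once by column, so their total sizes are equal.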

For an additive white Gaussian noise channel and a binary 
modulation, let $y_n$ be the received data bit at position $n$, 

$$y_n = (-1)^{x_n} + \xi_n,$$

where $\xi_n$ is the channel noise. The initial log-likelihood ratio 
(LLR) for the input data bit $n$, denoted as $Z_n^{(0)}$, is 

$$Z_n^{(0)} = \ln \frac{P(x_n = 0 \mid y_n)}{P(x_n = 1 \mid y_n)} = \frac{2 y_n}{\sigma^2},$$

where $\sigma^2$ is the estimated variance of the channel noise. The 
performance of the min-sum algorithm does not depend on the 
channel estimate. We can, thus, set $Z_n^{(0)} = y_n$ in practice. 
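
The following sketch computes the initial LLRs both with and without a noise variance estimate. The function name `initial_llr`, its optional `sigma2` parameter, and the sample values are assumptions made for illustration. The two results differ only by the positive factor $2/\sigma^2$; since the standard min-sum updates use only signs, minima, and sums, such a common scaling changes no hard decision.

```python
def initial_llr(y, sigma2=None):
    """Initial LLRs Z_n^(0) from the received values y_n.

    With a variance estimate sigma2, return 2 * y_n / sigma2.
    Without one, return y_n itself, as suggested in the text.
    """
    if sigma2 is None:
        return [float(v) for v in y]
    return [2.0 * v / sigma2 for v in y]


y = [0.5, -1.0, 0.25]                      # assumed received values
llr_scaled = initial_llr(y, sigma2=0.5)    # [2.0, -4.0, 1.0]
llr_simple = initial_llr(y)                # [0.5, -1.0, 0.25]

# Both choices lead to the same initial hard decisions.
hard_scaled = [0 if z > 0 else 1 for z in llr_scaled]
hard_simple = [0 if z > 0 else 1 for z in llr_simple]
```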

III. The (Normalized/Offset) Min-sum Algorithm 

The standard min-sum algorithm [7], [8] for decoding 
LDPC codes is a parallel, iterative soft decoding algorithm. At each 
iteration, messages are first sent from the variable nodes to 
the check nodes, called the horizontal scan. Then messages 
are sent from the check nodes back to the variable nodes, 
called the vertical scan. During each iteration, the a-posteriori 
probability for each bit is also computed. A hard decoding 
decision is made for each bit based on the probability, and 
decoded bits are checked against all parity check constraints 
to see if they are a valid codeword. 

At iteration $k$, let $Z_n^{(k)}$ be the posteriori LLR for the input 
data bit $n$. Let $Z_{mn}^{(k)}$ denote the message sent from variable 
node $n$ to check node $m$. $Z_{mn}^{(k)}$ is the log-likelihood ratio that 
the $n$-th bit of the input data $x$ has the value 0 versus 1, given 
the information obtained via the check nodes other than check 
node $m$. Let $L_{mn}^{(k)}$ denote the message sent from check node 
$m$ to variable node $n$. $L_{mn}^{(k)}$ is the log-likelihood ratio that the 
check node $m$ is satisfied when the input data bit $n$ is fixed to 
value 0 versus value 1 and the other bits are independent with 
log-likelihood ratios $Z_{mn'}^{(k-1)}$, $n' \in \mathcal{N}(m) \setminus n$. The pseudo-code 
for the min-sum algorithm is given as follows. 
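Initialization: For $n \in \{1, 2, \ldots, N\}$, 

$$Z_{mn}^{(0)} = Z_n^{(0)}, \quad \text{for } m \in \mathcal{M}(n).$$

Iteration: 

1) Horizontal scan (check node update rule): 
For each $m$ and each $n \in \mathcal{N}(m)$, 

$$L_{mn}^{(k)} = \prod_{n' \in \mathcal{N}(m) \setminus n} \mathrm{sgn}\left(Z_{mn'}^{(k-1)}\right) \times \min_{n' \in \mathcal{N}(m) \setminus n} \left|Z_{mn'}^{(k-1)}\right|. \tag{1}$$

2) Vertical scan (variable node update rule): 
For each $n$ and each $m \in \mathcal{M}(n)$, 

$$Z_{mn}^{(k)} = Z_n^{(0)} + \sum_{m' \in \mathcal{M}(n) \setminus m} L_{m'n}^{(k)}. \tag{2}$$

3) Decoding: 
For each bit, compute its posteriori log-likelihood ratio 
(LLR) 

$$Z_n^{(k)} = Z_n^{(0)} + \sum_{m' \in \mathcal{M}(n)} L_{m'n}^{(k)}.$$

Then estimate the original codeword $\hat{x}^{(k)}$ as 

$$\hat{x}_n^{(k)} = \begin{cases} 0, & \text{if } Z_n^{(k)} > 0; \\ 1, & \text{otherwise;} \end{cases} \quad \text{for } n = 1, 2, \ldots, N.$$

If $H (\hat{x}^{(k)})^T = 0$ or the iteration number exceeds some 
cap, stop the iteration and output $\hat{x}^{(k)}$ as the decoded 
codeword. 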
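
The three steps above can be sketched in Python as follows. This is a minimal, unoptimized sketch rather than the implementation used in our experiments. The function name `min_sum_decode`, the dictionary-based message storage, the toy matrix `H`, and the received vector `y` are all assumptions made for illustration. Every check node is assumed to have degree at least two, and a zero message is treated as positive when signs are multiplied.

```python
def min_sum_decode(H, z0, max_iter=20):
    """Two-scan min-sum decoding. Returns (hard decisions, iterations used)."""
    M, N = len(H), len(H[0])
    N_of = [[n for n in range(N) if H[m][n]] for m in range(M)]  # N(m)
    M_of = [[m for m in range(M) if H[m][n]] for n in range(N)]  # M(n)
    # Initialization: Z_mn^(0) = Z_n^(0) on every edge (m, n).
    Z = {(m, n): z0[n] for m in range(M) for n in N_of[m]}
    L = {}
    x_hat = [0 if z > 0 else 1 for z in z0]
    for k in range(1, max_iter + 1):
        # 1) Horizontal scan, Eq. (1): sign product times minimum magnitude.
        for m in range(M):
            for n in N_of[m]:
                others = [Z[m, n2] for n2 in N_of[m] if n2 != n]
                sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
                L[m, n] = sign * min(abs(v) for v in others)
        # 2) Vertical scan, Eq. (2): sum over the other check nodes.
        for n in range(N):
            for m in M_of[n]:
                Z[m, n] = z0[n] + sum(L[m2, n] for m2 in M_of[n] if m2 != m)
        # 3) Decoding: posteriori LLR, hard decision, parity check test.
        post = [z0[n] + sum(L[m, n] for m in M_of[n]) for n in range(N)]
        x_hat = [0 if z > 0 else 1 for z in post]
        if all(sum(x_hat[n] for n in N_of[m]) % 2 == 0 for m in range(M)):
            return x_hat, k
    return x_hat, max_iter


# Toy example (assumed): codeword [1,0,0,0,1,1,0] sent as (-1)^x, bit 2 unreliable.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
y = [-0.9, 1.1, -0.2, 0.8, -1.2, -0.7, 1.0]
x_hat, iters = min_sum_decode(H, y)
print(x_hat, iters)  # [1, 0, 0, 0, 1, 1, 0] 1
```

The sketch needs both `N_of` and `M_of`, that is, the addressing in both directions, which is the cost the single-scan version removes.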

One performance improvement to the above standard min-sum 
algorithm is to multiply $L_{mn}^{(k)}$ computed in (1) by a 
positive constant $\lambda_k$ smaller than 1, i.e., 

$$L_{mn}^{(k)} \leftarrow \lambda_k L_{mn}^{(k)}.$$

The min-sum algorithm with such a modification is referred 
to as the normalized min-sum algorithm [9], [10]. 

Another improvement to the standard min-sum algorithm 
is to reduce the reliability values $|L_{mn}^{(k)}|$ computed in (1) by a 
positive value $\beta_k$, i.e., 

$$L_{mn}^{(k)} \leftarrow \mathrm{sgn}\left(L_{mn}^{(k)}\right) \max\left(\left|L_{mn}^{(k)}\right| - \beta_k,\ 0\right).$$

The min-sum algorithm with such a modification is referred to 
as the offset min-sum algorithm [10]. The difference between 
the standard min-sum algorithm and the normalized/offset one 
is minor for software/hardware implementations. 

IV. The Single-Scan Min-Sum Algorithm 

It is straightforward to rewrite the variable node update 
rule (2) as 

$$Z_{mn}^{(k)} = Z_n^{(k)} - L_{mn}^{(k)}.$$

This follows by subtracting $L_{mn}^{(k)}$ from the posteriori LLR 
$Z_n^{(k)} = Z_n^{(0)} + \sum_{m' \in \mathcal{M}(n)} L_{m'n}^{(k)}$. 
If we have computed $Z_n^{(k)}$, the variable node message $Z_{mn}^{(k)}$ can 
be obtained from the check node message $L_{mn}^{(k)}$. Hence, we can 
merge the horizontal scan and the vertical scan into a single 
horizontal scan where only the check node messages $L_{mn}^{(k)}$ are 
computed directly from $Z_n^{(k-1)}$ and $L_{mn}^{(k-1)}$. In summary, the 
single-scan min-sum algorithm consists of the following major 
steps. 

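Initialization: $L_{mn}^{(0)} = 0$. 

Horizontal scan (check node update rule): 
For each $n$, set 

$$Z_n^{(k)} = Z_n^{(0)}.$$

Then, for each $m$ and each $n \in \mathcal{N}(m)$, 

$$L_{mn}^{(k)} = \prod_{n' \in \mathcal{N}(m) \setminus n} \mathrm{sgn}\left(Z_{n'}^{(k-1)} - L_{mn'}^{(k-1)}\right) \times \min_{n' \in \mathcal{N}(m) \setminus n} \left|Z_{n'}^{(k-1)} - L_{mn'}^{(k-1)}\right|, \tag{3}$$

$$Z_n^{(k)} \mathrel{+}= L_{mn}^{(k)}.$$

Decoding: 

$$\hat{x}_n^{(k)} = \begin{cases} 0, & \text{if } Z_n^{(k)} > 0; \\ 1, & \text{otherwise.} \end{cases}$$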
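
A minimal Python sketch of the single-scan steps is given below. The function name `single_scan_decode`, the toy matrix `H`, and the received vector `y` are assumptions made for illustration, and the code favors readability over speed. The sketch keeps two arrays, `Z_prev` for $Z_n^{(k-1)}$ and `Z_new` for $Z_n^{(k)}$, and uses only the check-to-variable index sets $\mathcal{N}(m)$. Every check node is assumed to have degree at least two.

```python
def single_scan_decode(H, z0, max_iter=20):
    """Single-scan min-sum decoding. Returns (hard decisions, iterations used)."""
    M, N = len(H), len(H[0])
    N_of = [[n for n in range(N) if H[m][n]] for m in range(M)]  # N(m) only
    # Initialization: L_mn^(0) = 0 on every edge (m, n).
    L = {(m, n): 0.0 for m in range(M) for n in N_of[m]}
    Z_prev = list(z0)                       # Z_n^(k-1), starting from Z_n^(0)
    x_hat = [0 if z > 0 else 1 for z in z0]
    for k in range(1, max_iter + 1):
        Z_new = list(z0)                    # Z_n^(k) = Z_n^(0)
        for m in range(M):
            # Variable node messages recovered as Z_n^(k-1) - L_mn^(k-1).
            T = {n: Z_prev[n] - L[m, n] for n in N_of[m]}
            for n in N_of[m]:
                others = [T[n2] for n2 in N_of[m] if n2 != n]
                sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
                L[m, n] = sign * min(abs(v) for v in others)   # Eq. (3)
                Z_new[n] += L[m, n]                            # Z_n^(k) += L_mn^(k)
        # Decoding: hard decision and parity check test.
        x_hat = [0 if z > 0 else 1 for z in Z_new]
        if all(sum(x_hat[n] for n in N_of[m]) % 2 == 0 for m in range(M)):
            return x_hat, k
        Z_prev = Z_new
    return x_hat, max_iter


# Toy example (assumed): codeword [1,0,0,0,1,1,0] sent as (-1)^x, bit 2 unreliable.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
y = [-0.9, 1.1, -0.2, 0.8, -1.2, -0.7, 1.0]
x_hat, iters = single_scan_decode(H, y)
print(x_hat, iters)  # [1, 0, 0, 0, 1, 1, 0] 1
```

The temporary dictionary `T` is filled before any $L_{mn}^{(k)}$ of the same check node is overwritten, so all messages of one check node are computed from values of iteration $k-1$.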

Compared with the original double-scan min-sum algorithm, 
the single-scan version can be not only faster but 
also more memory efficient. We can save memory by 
storing the $Z_n^{(k)}$s, which comprise $N$ items, instead of the $Z_{mn}^{(k)}$s, which 
comprise $N \cdot d_v$ items ($d_v$ is the average variable node degree). For software 
implementations, the single-scan version needs only to store 
the addressing (indexing) from check nodes to variable nodes. 
However, for the original version, both the addressing from 
check nodes to variable nodes and the one from variable nodes 
to check nodes are required. The single-scan version cuts down 
the amount of memory used for addressing by half. Such 
memory saving could be important if the min-sum algorithm 
is implemented in next-generation wireless/mobile computing 
devices where available memory could be very limited. 
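
To make the counts concrete, the sketch below tallies the items named above for a given parity check matrix. The function name `storage_counts`, the dictionary keys, and the toy matrix are assumptions of this example; the number of non-zero elements of $H$ equals $N \cdot d_v$.

```python
def storage_counts(H):
    """Count stored items and index entries for the two decoder variants."""
    N = len(H[0])
    edges = sum(sum(row) for row in H)   # non-zero elements of H, i.e. N * d_v
    return {
        "z_mn_items": edges,             # variable node messages, two-scan
        "z_n_items": N,                  # posteriori LLRs, single-scan
        "index_two_scan": 2 * edges,     # check->variable and variable->check
        "index_single_scan": edges,      # check->variable only
    }


# Toy parity check matrix (assumed for illustration only).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
counts = storage_counts(H)
print(counts["z_mn_items"], counts["z_n_items"])              # 12 7
print(counts["index_two_scan"], counts["index_single_scan"])  # 24 12
```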

Could the memory saving be directly translated into the 
saving of wiring for hardware implementations? It has been 
found that our hardware implementation of the original min- 
sum algorithm for decoding the LDPC codes used for China's 
HDTV takes a significant amount of circuit area (sometimes 
50%) just for implementing the connections from the variable 
nodes to the check nodes and the connections from the 
check nodes to variable nodes. Since the simplified min-sum 
algorithm has only the horizontal scan, the circuit area could 
be reduced if only the connections from the check nodes 
to variable nodes are required. We are now at the stage of 
verifying this statement in our lab. 

V. Further Simplification for the Single-Scan 
Algorithm 

The original min-sum algorithm uses a lot of memory for 
storing the variable node messages $Z_{mn}^{(k)}$ and the check node 
messages $L_{mn}^{(k)}$. Although we can use one memory cell to store 
both $Z_{mn}^{(k)}$ and $L_{mn}^{(k)}$, we still need $\sum_n |\mathcal{M}(n)|$ memory cells 
to store them, one for each non-zero element of the parity 
check matrix $H$. This statement still holds for the single-scan 
min-sum algorithm where only the check node messages $L_{mn}^{(k)}$ 
are stored. 

If we use $b$ bits ($b = 6 \sim 8$ in practice) to store a value 
(including its sign), then in total they require $b \cdot \sum_n |\mathcal{M}(n)|$ 
bits. For VLSI implementations, that could take a significant 
amount of circuit area. It also leads to high energy consump- 
tion due to the intensive manipulation (reading/writing) of 
those memory cells at each iteration, two writing operations 
for each cell by the original min-sum algorithm. 

Our simplification comes from the following observation 
of the check node messages $L_{mn}^{(k)}$ computed at the horizontal 
scan. The check node messages $L_{mn}^{(k)}$ are computed using (1) 
or (3). For the check node messages $L_{mn}^{(k)}$ of the same check 
node $m$, all have the same absolute value except one. The first 
absolute value is the minimal absolute value of the $|L_{mn}^{(k)}|$s for 
the same $m$. The second absolute value is the second minimal 
absolute value of the $|L_{mn}^{(k)}|$s. This observation is quite obvious. 
Dr. Guilloud also mentioned in his thesis [12] (section 4.1.2) 
that two messages need to be saved for each parity-check 
equation. This paper offers the detail of exploiting this unique 
characteristic to possibly cut down the memory usage of the 
single-scan min-sum algorithm further. 

To be more specific, for the check node messages $L_{mn}^{(k)}$ of 
the same check node $m$, let $A_m^{(k)}$ be the first absolute value, 

$$A_m^{(k)} = \min_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|.$$

From Eq. (1), we have 

$$A_m^{(k)} = \min_{n \in \mathcal{N}(m)} \left|Z_{mn}^{(k-1)}\right|,$$

which is the minimal value of the $|Z_{mn}^{(k-1)}|$s of the same check 
node $m$. 

Let $B_m^{(k)}$ be the second absolute value, 

$$B_m^{(k)} = \max_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|,$$

and let $\tilde{n}_m^{(k)}$ be the position of $L_{mn}^{(k)}$ of the second minimal 
absolute value, 

$$\tilde{n}_m^{(k)} = \arg\max_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|.$$

From Eq. (1), we have 

$$\tilde{n}_m^{(k)} = \arg\min_{n \in \mathcal{N}(m)} \left|Z_{mn}^{(k-1)}\right|.$$

That is, the position $\tilde{n}_m^{(k)}$ of $L_{mn}^{(k)}$ of the second minimal absolute 
value $B_m^{(k)}$ is the position of $Z_{mn}^{(k-1)}$ of the minimal 
absolute value. 

From Eq. (1), we also have 

$$B_m^{(k)} = \min_{n \in \mathcal{N}(m) \setminus \tilde{n}_m^{(k)}} \left|Z_{mn}^{(k-1)}\right|,$$

which is the second minimal value of the $|Z_{mn}^{(k-1)}|$s of the same 
check node $m$. 

Hence, for the check node messages $L_{mn}^{(k)}$ of the same check 
node, to save memory, we only need to store the two absolute 
values, the position of the second one, and the signs of the $L_{mn}^{(k)}$s. 

Let the sign of $L_{mn}^{(k)}$ be $S_{mn}^{(k)}$, $S_{mn}^{(k)} = \mathrm{sgn}(L_{mn}^{(k)})$. Let 
$f(A, B, n, \tilde{n})$ be a function defined as 

$$f(A, B, n, \tilde{n}) = \begin{cases} B, & \text{if } n = \tilde{n}; \\ A, & \text{otherwise.} \end{cases}$$

The check node message $L_{mn}^{(k)}$ can be recovered from its sign 
$S_{mn}^{(k)}$, the two absolute values $A_m^{(k)}$ and $B_m^{(k)}$, and the position 
$\tilde{n}_m^{(k)}$ as 

$$L_{mn}^{(k)} = S_{mn}^{(k)} f\left(A_m^{(k)}, B_m^{(k)}, n, \tilde{n}_m^{(k)}\right).$$

In the single-scan min-sum algorithm presented in the 
previous section, substituting the computation of $L_{mn}^{(k)}$ by the 
computation of $S_{mn}^{(k)}$, $A_m^{(k)}$, $B_m^{(k)}$, and $\tilde{n}_m^{(k)}$, we have a memory 
efficient version of the single-scan min-sum algorithm. 
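
The compact representation of the messages of one check node can be illustrated with the short Python sketch below. The helper names `compress` and `recover` and the sample messages are assumptions made for this example. The round trip is exact only for messages produced by the min-sum check node update, since those take at most two distinct magnitudes.

```python
def f(A, B, n, n_tilde):
    """Select the magnitude of L_mn: B at position n_tilde, A elsewhere."""
    return B if n == n_tilde else A


def compress(L_row):
    """L_row maps n in N(m) to L_mn. Return (A, B, n_tilde, signs)."""
    mags = {n: abs(v) for n, v in L_row.items()}
    A = min(mags.values())                  # first (minimal) absolute value
    n_tilde = max(mags, key=mags.get)       # position of the largest magnitude
    B = mags[n_tilde]                       # second absolute value
    S = {n: (1 if v >= 0 else -1) for n, v in L_row.items()}
    return A, B, n_tilde, S


def recover(A, B, n_tilde, S):
    """Rebuild every L_mn as S_mn * f(A, B, n, n_tilde)."""
    return {n: s * f(A, B, n, n_tilde) for n, s in S.items()}


# Messages of one check node with N(m) = {0, 1, 3, 4} (assumed sample values).
L_row = {0: -0.8, 1: 0.8, 3: 0.9, 4: -0.8}
A, B, n_tilde, S = compress(L_row)
print(A, B, n_tilde)                          # 0.8 0.9 3
print(recover(A, B, n_tilde, S) == L_row)     # True
```

Per check node, the stored data are two magnitudes, one position, and one sign bit per connected variable node.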



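Initialization: $A_m^{(0)} = B_m^{(0)} = \tilde{n}_m^{(0)} = 0$, $S_{mn}^{(0)} = \mathrm{sgn}(L_{mn}^{(0)})$. 

Horizontal scan (check node update rule): 
For each $n$, set $Z_n^{(k)} = Z_n^{(0)}$. Then, for each $m$ and each $n \in \mathcal{N}(m)$, 

$$L_{mn}^{(k-1)} = S_{mn}^{(k-1)} f\left(A_m^{(k-1)}, B_m^{(k-1)}, n, \tilde{n}_m^{(k-1)}\right),$$

$$L_{mn}^{(k)} = \prod_{n' \in \mathcal{N}(m) \setminus n} \mathrm{sgn}\left(Z_{n'}^{(k-1)} - L_{mn'}^{(k-1)}\right) \times \min_{n' \in \mathcal{N}(m) \setminus n} \left|Z_{n'}^{(k-1)} - L_{mn'}^{(k-1)}\right|,$$

$$A_m^{(k)} = \min_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|, \tag{4}$$

$$B_m^{(k)} = \max_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|, \tag{5}$$

$$\tilde{n}_m^{(k)} = \arg\max_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|,$$

$$S_{mn}^{(k)} = \mathrm{sgn}\left(L_{mn}^{(k)}\right),$$

$$Z_n^{(k)} \mathrel{+}= L_{mn}^{(k)}.$$

Decoding: 

$$\hat{x}_n^{(k)} = \begin{cases} 0, & \text{if } Z_n^{(k)} > 0; \\ 1, & \text{otherwise.} \end{cases}$$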



In the memory efficient version of the single-scan min-sum 
algorithm, the previous check node messages $L_{mn}^{(k-1)}$ of the 
check node $m$ are computed on the fly and stored as temporary 
data during the computation of the check node messages $L_{mn}^{(k)}$ of 
the same check node $m$. $S_{mn}$, $A_m$, $B_m$, $\tilde{n}_m$, and $Z_n$ are 
persistent data stored in memory. 
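
A minimal Python sketch of the memory efficient version follows. The function name `compact_single_scan_decode`, the list-based storage, the toy matrix `H`, and the received vector `y` are assumptions made for illustration. The persistent data are `A`, `B`, `pos` (the position $\tilde{n}_m$), the signs `S`, and the LLR arrays; the list `T` is temporary. The sketch uses the fact that the product of the signs of the other messages equals the product of all signs times the sign of the excluded one. Every check node is assumed to have degree at least two.

```python
def compact_single_scan_decode(H, z0, max_iter=20):
    """Single-scan min-sum storing only A_m, B_m, n~_m and sign bits per check."""
    M, N = len(H), len(H[0])
    N_of = [[n for n in range(N) if H[m][n]] for m in range(M)]
    A, B = [0.0] * M, [0.0] * M             # two magnitudes per check node
    pos = [-1] * M                          # n~_m; -1 means "not set yet"
    S = [[1] * len(N_of[m]) for m in range(M)]   # one sign per edge
    Z_prev = list(z0)
    x_hat = [0 if z > 0 else 1 for z in z0]
    for k in range(1, max_iter + 1):
        Z_new = list(z0)
        for m in range(M):
            # Rebuild L_mn^(k-1) on the fly and form Z_n^(k-1) - L_mn^(k-1).
            T = [Z_prev[n] - S[m][i] * (B[m] if n == pos[m] else A[m])
                 for i, n in enumerate(N_of[m])]
            mags = [abs(t) for t in T]
            i_min = min(range(len(T)), key=mags.__getitem__)
            min1 = mags[i_min]                                  # smallest magnitude
            min2 = min(v for i, v in enumerate(mags) if i != i_min)
            parity = -1 if sum(t < 0 for t in T) % 2 else 1     # product of all signs
            A[m], B[m], pos[m] = min1, min2, N_of[m][i_min]     # Eqs. (4), (5)
            for i, n in enumerate(N_of[m]):
                S[m][i] = parity * (-1 if T[i] < 0 else 1)      # sgn(L_mn^(k))
                Z_new[n] += S[m][i] * (min2 if i == i_min else min1)
        x_hat = [0 if z > 0 else 1 for z in Z_new]
        if all(sum(x_hat[n] for n in N_of[m]) % 2 == 0 for m in range(M)):
            return x_hat, k
        Z_prev = Z_new
    return x_hat, max_iter


# Toy example (assumed): codeword [1,0,0,0,1,1,0] sent as (-1)^x, bit 2 unreliable.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
y = [-0.9, 1.1, -0.2, 0.8, -1.2, -0.7, 1.0]
x_hat, iters = compact_single_scan_decode(H, y)
print(x_hat, iters)  # [1, 0, 0, 0, 1, 1, 0] 1
```

Finding the two smallest magnitudes once per check node also replaces the separate minimization for each excluded position.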

The single-scan min-sum algorithm is fully equivalent to the 
original min-sum algorithm. To modify the single-scan min-sum 
algorithm to be equivalent to the normalized min-sum 
algorithm, Eq. (4) and Eq. (5) should be changed to 

$$A_m^{(k)} = \lambda_k \min_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|,$$

$$B_m^{(k)} = \lambda_k \max_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right|,$$

where $\lambda_k$ is a constant at iteration $k$ satisfying $0 < \lambda_k < 1$. 

To modify the single-scan min-sum algorithm to be equivalent 
to the offset min-sum algorithm, Eq. (4) and Eq. (5) 
should be changed to 

$$A_m^{(k)} = \max\left(\min_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right| - \beta,\ 0\right),$$

$$B_m^{(k)} = \max\left(\max_{n \in \mathcal{N}(m)} \left|L_{mn}^{(k)}\right| - \beta,\ 0\right),$$

where $\beta$ is a constant, satisfying $\beta > 0$. 
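
Because only two magnitudes are kept per check node, either correction touches just those two numbers. The sketch below shows both variants; the function names `normalized_pair` and `offset_pair` and the sample values are assumptions made for illustration.

```python
def normalized_pair(min_abs, max_abs, lam):
    """Normalized variant: scale both stored magnitudes by 0 < lam < 1."""
    return lam * min_abs, lam * max_abs


def offset_pair(min_abs, max_abs, beta):
    """Offset variant: subtract beta > 0 and clip both magnitudes at zero."""
    return max(min_abs - beta, 0.0), max(max_abs - beta, 0.0)


print(normalized_pair(0.8, 0.9, 0.5))  # (0.4, 0.45)
print(offset_pair(1.5, 2.0, 0.5))      # (1.0, 1.5)
print(offset_pair(0.8, 0.9, 1.0))      # (0.0, 0.0)
```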

VI. Summary 

This paper presents the single-scan min-sum algorithm as a 
simplified version of the original (normalized/offset) min-sum 
algorithm for decoding LDPC codes. It merges the horizontal 
scan and the vertical scan in the original min-sum algorithm 
into a single horizontal scan. A memory efficient version of 
the single-scan min-sum algorithm is also presented where 
the check node messages of each check node are stored 
using their signs together with two of the messages of the 
minimal absolute values. All the simplifications are applicable 
for decoding binary LDPC codes. 